Overview

Dataset statistics

Number of variables: 22
Number of observations: 124520
Missing cells: 21016
Missing cells (%): 0.8%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 21.9 MiB
Average record size in memory: 184.0 B
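
The derived figures in this block can be recomputed from the raw counts. The following is a minimal sketch, not the profiler's own code; the variable names are illustrative, and the per-column missing counts are copied from the two Missing warnings in this report.

```python
# Figures copied from the "Dataset statistics" block of this report.
n_variables = 22
n_observations = 124520
missing_cells = 21016
total_size_mib = 21.9

# Missing cells (%) is the share of empty cells among all cells.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# The two "Missing" warnings (NB_PIECES, NBSIN_TYPE2_AN2) account for every missing cell.
missing_by_column = {"NB_PIECES": 7458, "NBSIN_TYPE2_AN2": 13558}
assert sum(missing_by_column.values()) == missing_cells

# Average record size, taking 1 MiB = 1024 * 1024 bytes.
avg_record_bytes = total_size_mib * 1024 * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: about {avg_record_bytes:.0f} B")
```

The recomputed record size lands close to the reported 184.0 B; the small gap presumably comes from the total size being rounded to 21.9 MiB.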

Variable types

Numeric: 7
Categorical: 15

Warnings

ZONIER has a high cardinality: 82 distinct values High cardinality
NB_PIECES is highly correlated with VALEUR_DES_BIENS High correlation
VALEUR_DES_BIENS is highly correlated with NB_PIECES High correlation
NB is highly correlated with COUT and 1 other field High correlation
COUT is highly correlated with NB and 2 other fields High correlation
PurePremium is highly correlated with COUT and 2 other fields High correlation
Frequency is highly correlated with NB and 1 other field High correlation
AvgClaimAmount is highly correlated with COUT and 1 other field High correlation
NB is highly correlated with COUT and 3 other fields High correlation
COUT is highly correlated with NB and 3 other fields High correlation
PurePremium is highly correlated with NB and 3 other fields High correlation
Frequency is highly correlated with NB and 3 other fields High correlation
AvgClaimAmount is highly correlated with NB and 3 other fields High correlation
OBJETS_DE_VALEUR is highly correlated with VALEUR_DES_BIENS High correlation
Frequency is highly correlated with PurePremium High correlation
VALEUR_DES_BIENS is highly correlated with OBJETS_DE_VALEUR and 1 other field High correlation
NBSIN_TYPE1_AN1 is highly correlated with NBSIN_TYPE2_AN1 High correlation
COUT is highly correlated with AvgClaimAmount and 1 other field High correlation
TYPE_HABITATION is highly correlated with ZONIER and 1 other field High correlation
ZONIER is highly correlated with TYPE_HABITATION High correlation
AvgClaimAmount is highly correlated with COUT and 1 other field High correlation
SITUATION_JURIDIQUE is highly correlated with VALEUR_DES_BIENS and 1 other field High correlation
NBSIN_TYPE1_AN3 is highly correlated with NBSIN_TYPE2_AN3 High correlation
NBSIN_TYPE2_AN1 is highly correlated with NBSIN_TYPE1_AN1 High correlation
PurePremium is highly correlated with Frequency and 2 other fields High correlation
NBSIN_TYPE2_AN3 is highly correlated with NBSIN_TYPE1_AN3 High correlation
NB_PIECES is highly correlated with TYPE_HABITATION High correlation
NB_PIECES has 7458 (6.0%) missing values Missing
NBSIN_TYPE2_AN2 has 13558 (10.9%) missing values Missing
COUT is highly skewed (γ1 = 32.70976088) Skewed
PurePremium is highly skewed (γ1 = 138.335193) Skewed
Frequency is highly skewed (γ1 = 83.65521992) Skewed
AvgClaimAmount is highly skewed (γ1 = 33.53925473) Skewed
id has unique values Unique
VALEUR_DES_BIENS has 6662 (5.4%) zeros Zeros
COUT has 122356 (98.3%) zeros Zeros
PurePremium has 122356 (98.3%) zeros Zeros
Frequency has 122356 (98.3%) zeros Zeros
AvgClaimAmount has 122356 (98.3%) zeros Zeros
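
COUT, PurePremium, Frequency and AvgClaimAmount all report exactly 122356 zeros and are flagged as mutually correlated with NB. The report does not define these columns. One reading that would explain the pattern is the usual actuarial convention, in which the three ratios are derived from a claim count (NB), a claim cost (COUT) and an exposure (EXPO). The sketch below illustrates that assumption only; the function name and the example figures are hypothetical.

```python
def claim_ratios(nb, cout, expo):
    """Derive per-policy ratios from claim count, claim cost and exposure.

    Assumed definitions (not stated in the report):
    frequency = nb / expo, avg_claim = cout / nb, pure_premium = cout / expo.
    """
    frequency = nb / expo                      # claims per unit of exposure
    avg_claim = cout / nb if nb > 0 else 0.0   # cost per claim, 0 when no claim
    pure_premium = cout / expo                 # cost per unit of exposure
    return frequency, avg_claim, pure_premium

# A policy without claims yields zero for all three ratios at once.
no_claim = claim_ratios(nb=0, cout=0.0, expo=1.0)

# Hypothetical policy: one claim of 1200 over half a year of exposure.
one_claim = claim_ratios(nb=1, cout=1200.0, expo=0.5)

# Share of zero rows reported for each of the four columns.
zero_share = 100 * 122356 / 124520

print(no_claim, one_claim, f"{zero_share:.1f}%")
```

Under these definitions the pure premium equals frequency times average claim amount, so strong correlation among the derived columns would be expected rather than informative.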

Reproduction

Analysis started: 2021-06-07 19:29:08.793782
Analysis finished: 2021-06-07 19:29:31.032663
Duration: 22.24 seconds
Software version: pandas-profiling v3.0.0
Download configuration: config.json
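
The reported duration is simply the difference between the two timestamps. A small sketch with the standard library, using the values printed in this block:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block.
started = datetime.fromisoformat("2021-06-07 19:29:08.793782")
finished = datetime.fromisoformat("2021-06-07 19:29:31.032663")

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()

print(f"Duration: {duration:.2f} seconds")
```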

Variables

EXPO
Real number (ℝ≥0)

Distinct: 2053
Distinct (%): 1.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.8375858269
Minimum: 0.0027322397
Maximum: 1
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.9 MiB

Quantile statistics

Minimum: 0.0027322397
5-th percentile: 0.125683032
Q1: 0.8164381944
median: 1
Q3: 1
95-th percentile: 1
Maximum: 1
Range: 0.9972677603
Interquartile range (IQR): 0.1835618056

Descriptive statistics

Standard deviation: 0.29670772
Coefficient of variation (CV): 0.354241572
Kurtosis: 1.004256347
Mean: 0.8375858269
Median Absolute Deviation (MAD): 0
Skewness: -1.587807105
Sum: 104296.1872
Variance: 0.08803547112
Monotonicity: Not monotonic
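
Several EXPO statistics are functions of one another, which gives a quick consistency check on the report. The sketch below recomputes them from the printed values; it assumes the standard definitions (CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 - Q1) and is not the profiler's own implementation.

```python
import math

# Values copied from the EXPO statistics in this report.
n = 124520
mean = 0.8375858269
std = 0.29670772
total = 104296.1872
q1, q3 = 0.8164381944, 1.0
minimum, maximum = 0.0027322397, 1.0

cv = std / mean                   # coefficient of variation
variance = std ** 2               # variance from the standard deviation
iqr = q3 - q1                     # interquartile range
value_range = maximum - minimum   # range
mean_from_sum = total / n         # mean recovered from the sum

print(cv, variance, iqr, value_range, mean_from_sum)
```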
Histogram with fixed size bins (bins=50)
Value                  Count    Frequency (%)
1                      88070    70.7%
0.0027397256           453      0.4%
0.334246397            303      0.2%
0.0027322397           261      0.2%
0.0849314928           209      0.2%
0.419178009            190      0.2%
0.5041093826           169      0.1%
0.2486338243           156      0.1%
0.3333332539           154      0.1%
0.243835507            143      0.1%
Other values (2043)    34412    27.6%
ValueCountFrequency (%)
0.0027322397261
0.2%
0.0027397256453
0.4%
0.005464479328
 
< 0.1%
0.005479451365
 
0.1%
0.00819671942
 
< 0.1%
0.00821917565
 
0.1%
0.008219176928
 
< 0.1%
0.010928958759
 
< 0.1%
0.010958900721
 
< 0.1%
0.010958902567
 
0.1%
ValueCountFrequency (%)
188070
70.7%
0.99999998513
 
< 0.1%
0.99999997022
 
< 0.1%
0.999999962739
 
< 0.1%
0.99999994042
 
< 0.1%
0.99999993294
 
< 0.1%
0.99999991062
 
< 0.1%
0.99999988821
 
< 0.1%
0.999999880823
 
< 0.1%
0.99999986591
 
< 0.1%

FORMULE
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 MiB

CONFORT        70125
MEDIUM         31170
ESSENTIEL      17842
ALL_INCLUDE    5383

Length

Max length: 11
Median length: 7
Mean length: 7.209171217
Min length: 6
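
The mean length is a count-weighted average of the category name lengths. A minimal sketch using the FORMULE counts from this report (the dictionary is transcribed by hand, so treat it as illustrative):

```python
# Category counts copied from the FORMULE section.
counts = {"CONFORT": 70125, "MEDIUM": 31170, "ESSENTIEL": 17842, "ALL_INCLUDE": 5383}

total_rows = sum(counts.values())

# Each row contributes len(value) characters.
total_chars = sum(len(value) * n_rows for value, n_rows in counts.items())
mean_length = total_chars / total_rows

print(total_rows, total_chars, round(mean_length, 9))
```

The character total obtained this way matches the "Total characters" figure reported for this variable under Characters and Unicode.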

Characters and Unicode

Total characters: 897686
Distinct characters: 15
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
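
As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point (`Lu` for Uppercase Letter, `Pc` for Connector Punctuation). The sketch below tallies categories over the FORMULE values, weighted by row count. It is not how the profiler computes its tables, and the standard library does not expose the script property, so only categories are reproduced.

```python
import unicodedata
from collections import Counter

# Category counts copied from the FORMULE section.
counts = {"CONFORT": 70125, "MEDIUM": 31170, "ESSENTIEL": 17842, "ALL_INCLUDE": 5383}

# Tally the Unicode general category of every character, weighted by row count.
category_totals = Counter()
for value, n_rows in counts.items():
    for ch in value:
        category_totals[unicodedata.category(ch)] += n_rows

print(dict(category_totals))
```

The two totals agree with the "Most occurring categories" table for this variable: 892303 uppercase letters and 5383 connector punctuation characters (the underscore in ALL_INCLUDE).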

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: MEDIUM
2nd row: CONFORT
3rd row: ESSENTIEL
4th row: ESSENTIEL
5th row: ESSENTIEL

Common Values

Value          Count    Frequency (%)
CONFORT        70125    56.3%
MEDIUM         31170    25.0%
ESSENTIEL      17842    14.3%
ALL_INCLUDE    5383     4.3%

Length

Histogram of lengths of the category

Pie chart

Value          Count    Frequency (%)
confort        70125    56.3%
medium         31170    25.0%
essentiel      17842    14.3%
all_include    5383     4.3%

Most occurring characters

ValueCountFrequency (%)
O140250
15.6%
N93350
10.4%
E90079
10.0%
T87967
9.8%
C75508
8.4%
F70125
7.8%
R70125
7.8%
M62340
6.9%
I54395
 
6.1%
D36553
 
4.1%
Other values (5)116994
13.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter892303
99.4%
Connector Punctuation5383
 
0.6%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
O140250
15.7%
N93350
10.5%
E90079
10.1%
T87967
9.9%
C75508
8.5%
F70125
7.9%
R70125
7.9%
M62340
7.0%
I54395
 
6.1%
D36553
 
4.1%
Other values (4)111611
12.5%
Connector Punctuation
ValueCountFrequency (%)
_5383
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin892303
99.4%
Common5383
 
0.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
O140250
15.7%
N93350
10.5%
E90079
10.1%
T87967
9.9%
C75508
8.5%
F70125
7.9%
R70125
7.9%
M62340
7.0%
I54395
 
6.1%
D36553
 
4.1%
Other values (4)111611
12.5%
Common
ValueCountFrequency (%)
_5383
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII897686
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
O140250
15.6%
N93350
10.4%
E90079
10.0%
T87967
9.8%
C75508
8.4%
F70125
7.8%
R70125
7.8%
M62340
6.9%
I54395
 
6.1%
D36553
 
4.1%
Other values (5)116994
13.0%

TYPE_RESIDENCE
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 MiB

PRINCIPALE    104570
SECONDAIRE    19950

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 1245200
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: PRINCIPALE
2nd row: PRINCIPALE
3rd row: PRINCIPALE
4th row: SECONDAIRE
5th row: PRINCIPALE

Common Values

Value         Count     Frequency (%)
PRINCIPALE    104570    84.0%
SECONDAIRE    19950     16.0%

Length

Histogram of lengths of the category

Pie chart

Value         Count     Frequency (%)
principale    104570    84.0%
secondaire    19950     16.0%

Most occurring characters

ValueCountFrequency (%)
I229090
18.4%
P209140
16.8%
E144470
11.6%
R124520
10.0%
N124520
10.0%
C124520
10.0%
A124520
10.0%
L104570
8.4%
S19950
 
1.6%
O19950
 
1.6%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter1245200
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
I229090
18.4%
P209140
16.8%
E144470
11.6%
R124520
10.0%
N124520
10.0%
C124520
10.0%
A124520
10.0%
L104570
8.4%
S19950
 
1.6%
O19950
 
1.6%

Most occurring scripts

ValueCountFrequency (%)
Latin1245200
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
I229090
18.4%
P209140
16.8%
E144470
11.6%
R124520
10.0%
N124520
10.0%
C124520
10.0%
A124520
10.0%
L104570
8.4%
S19950
 
1.6%
O19950
 
1.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII1245200
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
I229090
18.4%
P209140
16.8%
E144470
11.6%
R124520
10.0%
N124520
10.0%
C124520
10.0%
A124520
10.0%
L104570
8.4%
S19950
 
1.6%
O19950
 
1.6%

TYPE_HABITATION
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 MiB

MAISON         67545
APPARTEMENT    56975

Length

Max length: 11
Median length: 6
Mean length: 8.287785095
Min length: 6

Characters and Unicode

Total characters: 1031995
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: APPARTEMENT
2nd row: MAISON
3rd row: APPARTEMENT
4th row: MAISON
5th row: MAISON

Common Values

Value          Count    Frequency (%)
MAISON         67545    54.2%
APPARTEMENT    56975    45.8%

Length

Histogram of lengths of the category

Pie chart

Value          Count    Frequency (%)
maison         67545    54.2%
appartement    56975    45.8%

Most occurring characters

ValueCountFrequency (%)
A181495
17.6%
M124520
12.1%
N124520
12.1%
P113950
11.0%
T113950
11.0%
E113950
11.0%
I67545
 
6.5%
S67545
 
6.5%
O67545
 
6.5%
R56975
 
5.5%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter1031995
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A181495
17.6%
M124520
12.1%
N124520
12.1%
P113950
11.0%
T113950
11.0%
E113950
11.0%
I67545
 
6.5%
S67545
 
6.5%
O67545
 
6.5%
R56975
 
5.5%

Most occurring scripts

ValueCountFrequency (%)
Latin1031995
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
A181495
17.6%
M124520
12.1%
N124520
12.1%
P113950
11.0%
T113950
11.0%
E113950
11.0%
I67545
 
6.5%
S67545
 
6.5%
O67545
 
6.5%
R56975
 
5.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII1031995
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A181495
17.6%
M124520
12.1%
N124520
12.1%
P113950
11.0%
T113950
11.0%
E113950
11.0%
I67545
 
6.5%
S67545
 
6.5%
O67545
 
6.5%
R56975
 
5.5%

NB_PIECES
Categorical

HIGH CORRELATION
MISSING

Distinct: 5
Distinct (%): < 0.1%
Missing: 7458
Missing (%): 6.0%
Memory size: 1.9 MiB

2.0    48778
1.0    32199
3.0    25342
4.0    7748
0.0    2995

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 351186
Distinct characters: 6
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1.0
2nd row: 3.0
3rd row: 2.0
4th row: 1.0
5th row: 2.0

Common Values

Value        Count    Frequency (%)
2.0          48778    39.2%
1.0          32199    25.9%
3.0          25342    20.4%
4.0          7748     6.2%
0.0          2995     2.4%
(Missing)    7458     6.0%

Length

Histogram of lengths of the category

Pie chart

Value    Count    Frequency (%)
2.0      48778    41.7%
1.0      32199    27.5%
3.0      25342    21.6%
4.0      7748     6.6%
0.0      2995     2.6%
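
The Common Values table and the pie chart table for NB_PIECES show different percentages for identical counts. The arithmetic below suggests why: the first appears to divide by all rows, the second by non-missing rows only. This is an inference from the numbers, not something the report states.

```python
# Counts copied from the NB_PIECES section.
counts = {"2.0": 48778, "1.0": 32199, "3.0": 25342, "4.0": 7748, "0.0": 2995}
missing = 7458

n_present = sum(counts.values())   # rows with a value
n_total = n_present + missing      # all rows

# Percentages relative to all rows (Common Values table).
share_of_all = {k: round(100 * v / n_total, 1) for k, v in counts.items()}

# Percentages relative to non-missing rows (pie chart table).
share_of_present = {k: round(100 * v / n_present, 1) for k, v in counts.items()}

print(share_of_all)
print(share_of_present)
```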

Most occurring characters

ValueCountFrequency (%)
0120057
34.2%
.117062
33.3%
248778
13.9%
132199
 
9.2%
325342
 
7.2%
47748
 
2.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number234124
66.7%
Other Punctuation117062
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0120057
51.3%
248778
20.8%
132199
 
13.8%
325342
 
10.8%
47748
 
3.3%
Other Punctuation
ValueCountFrequency (%)
.117062
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common351186
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0120057
34.2%
.117062
33.3%
248778
13.9%
132199
 
9.2%
325342
 
7.2%
47748
 
2.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII351186
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0120057
34.2%
.117062
33.3%
248778
13.9%
132199
 
9.2%
325342
 
7.2%
47748
 
2.2%

SITUATION_JURIDIQUE
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 MiB

LOCATAIRE    69199
PROPRIO      55321

Length

Max length: 9
Median length: 9
Mean length: 8.111451976
Min length: 7

Characters and Unicode

Total characters: 1010038
Distinct characters: 9
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: PROPRIO
2nd row: PROPRIO
3rd row: LOCATAIRE
4th row: LOCATAIRE
5th row: LOCATAIRE

Common Values

Value        Count    Frequency (%)
LOCATAIRE    69199    55.6%
PROPRIO      55321    44.4%

Length

Histogram of lengths of the category

Pie chart

Value        Count    Frequency (%)
locataire    69199    55.6%
proprio      55321    44.4%

Most occurring characters

ValueCountFrequency (%)
R179841
17.8%
O179841
17.8%
A138398
13.7%
I124520
12.3%
P110642
11.0%
L69199
 
6.9%
C69199
 
6.9%
T69199
 
6.9%
E69199
 
6.9%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter1010038
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
R179841
17.8%
O179841
17.8%
A138398
13.7%
I124520
12.3%
P110642
11.0%
L69199
 
6.9%
C69199
 
6.9%
T69199
 
6.9%
E69199
 
6.9%

Most occurring scripts

ValueCountFrequency (%)
Latin1010038
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
R179841
17.8%
O179841
17.8%
A138398
13.7%
I124520
12.3%
P110642
11.0%
L69199
 
6.9%
C69199
 
6.9%
T69199
 
6.9%
E69199
 
6.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII1010038
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
R179841
17.8%
O179841
17.8%
A138398
13.7%
I124520
12.3%
P110642
11.0%
L69199
 
6.9%
C69199
 
6.9%
T69199
 
6.9%
E69199
 
6.9%

NIVEAU_JURIDIQUE
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 MiB

JUR1    122841
JUR2    1679

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 498080
Distinct characters: 5
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: JUR1
2nd row: JUR1
3rd row: JUR1
4th row: JUR1
5th row: JUR1

Common Values

Value    Count     Frequency (%)
JUR1     122841    98.7%
JUR2     1679      1.3%

Length

Histogram of lengths of the category

Pie chart

Value    Count     Frequency (%)
jur1     122841    98.7%
jur2     1679      1.3%

Most occurring characters

ValueCountFrequency (%)
J124520
25.0%
U124520
25.0%
R124520
25.0%
1122841
24.7%
21679
 
0.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter373560
75.0%
Decimal Number124520
 
25.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
J124520
33.3%
U124520
33.3%
R124520
33.3%
Decimal Number
ValueCountFrequency (%)
1122841
98.7%
21679
 
1.3%

Most occurring scripts

ValueCountFrequency (%)
Latin373560
75.0%
Common124520
 
25.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
J124520
33.3%
U124520
33.3%
R124520
33.3%
Common
ValueCountFrequency (%)
1122841
98.7%
21679
 
1.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII498080
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
J124520
25.0%
U124520
25.0%
R124520
25.0%
1122841
24.7%
21679
 
0.3%

VALEUR_DES_BIENS
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 8
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 16636.54031
Minimum: 0
Maximum: 100000
Zeros: 6662
Zeros (%): 5.4%
Negative: 0
Negative (%): 0.0%
Memory size: 1.9 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 3500
median: 9000
Q3: 20000
95-th percentile: 50000
Maximum: 100000
Range: 100000
Interquartile range (IQR): 16500

Descriptive statistics

Standard deviation: 19532.91988
Coefficient of variation (CV): 1.174097469
Kurtosis: 4.291640411
Mean: 16636.54031
Median Absolute Deviation (MAD): 5500
Skewness: 2.084460644
Sum: 2071582000
Variance: 381534958.9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=8)
Value     Count    Frequency (%)
3500      41548    33.4%
9000      27741    22.3%
20000     26993    21.7%
35000     9189     7.4%
0         6662     5.4%
50000     6370     5.1%
80000     5259     4.2%
100000    758      0.6%
Value     Count    Frequency (%)
0         6662     5.4%
3500      41548    33.4%
9000      27741    22.3%
20000     26993    21.7%
35000     9189     7.4%
50000     6370     5.1%
80000     5259     4.2%
100000    758      0.6%
Value     Count    Frequency (%)
100000    758      0.6%
80000     5259     4.2%
50000     6370     5.1%
35000     9189     7.4%
20000     26993    21.7%
9000      27741    22.3%
3500      41548    33.4%
0         6662     5.4%
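
Because VALEUR_DES_BIENS takes only eight distinct values, its sum, mean and median can be recomputed directly from the value counts. The sketch below does so; `value_at_rank` is a hypothetical helper written for this illustration.

```python
# Value counts copied from the VALEUR_DES_BIENS tables.
value_counts = {
    0: 6662, 3500: 41548, 9000: 27741, 20000: 26993,
    35000: 9189, 50000: 6370, 80000: 5259, 100000: 758,
}

n = sum(value_counts.values())
total = sum(value * count for value, count in value_counts.items())
mean = total / n

def value_at_rank(rank):
    """Return the value found at a 1-based rank in the sorted data."""
    seen = 0
    for value in sorted(value_counts):
        seen += value_counts[value]
        if rank <= seen:
            return value
    raise ValueError("rank out of range")

# n is even, so the median averages the two middle observations.
median = (value_at_rank(n // 2) + value_at_rank(n // 2 + 1)) / 2

print(n, total, round(mean, 5), median)
```

The recomputed sum, mean and median agree with the figures listed under Descriptive statistics and Quantile statistics for this variable.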

OBJETS_DE_VALEUR
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 MiB

NIVEAU_1    116783
NIVEAU_2    7737

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 996160
Distinct characters: 9
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NIVEAU_1
2nd row: NIVEAU_1
3rd row: NIVEAU_1
4th row: NIVEAU_1
5th row: NIVEAU_1

Common Values

Value       Count     Frequency (%)
NIVEAU_1    116783    93.8%
NIVEAU_2    7737      6.2%

Length

Histogram of lengths of the category

Pie chart

Value       Count     Frequency (%)
niveau_1    116783    93.8%
niveau_2    7737      6.2%

Most occurring characters

ValueCountFrequency (%)
N124520
12.5%
I124520
12.5%
V124520
12.5%
E124520
12.5%
A124520
12.5%
U124520
12.5%
_124520
12.5%
1116783
11.7%
27737
 
0.8%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter747120
75.0%
Connector Punctuation124520
 
12.5%
Decimal Number124520
 
12.5%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N124520
16.7%
I124520
16.7%
V124520
16.7%
E124520
16.7%
A124520
16.7%
U124520
16.7%
Decimal Number
ValueCountFrequency (%)
1116783
93.8%
27737
 
6.2%
Connector Punctuation
ValueCountFrequency (%)
_124520
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin747120
75.0%
Common249040
 
25.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N124520
16.7%
I124520
16.7%
V124520
16.7%
E124520
16.7%
A124520
16.7%
U124520
16.7%
Common
ValueCountFrequency (%)
_124520
50.0%
1116783
46.9%
27737
 
3.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII996160
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N124520
12.5%
I124520
12.5%
V124520
12.5%
E124520
12.5%
A124520
12.5%
U124520
12.5%
_124520
12.5%
1116783
11.7%
27737
 
0.8%

ZONIER
Categorical

HIGH CARDINALITY
HIGH CORRELATION

Distinct: 82
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 MiB

C20                  11657
C9                   7716
B32                  5628
C5                   5491
C23                  4958
Other values (77)    89070

Length

Max length: 3
Median length: 3
Mean length: 2.619707677
Min length: 2

Characters and Unicode

Total characters: 326206
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: B40
2nd row: A11
3rd row: B32
4th row: C24
5th row: C9

Common Values

Value                Count    Frequency (%)
C20                  11657    9.4%
C9                   7716     6.2%
B32                  5628     4.5%
C5                   5491     4.4%
C23                  4958     4.0%
C2                   4862     3.9%
B43                  4845     3.9%
B40                  4623     3.7%
C8                   4358     3.5%
C6                   4174     3.4%
Other values (72)    66208    53.2%

Length

Histogram of lengths of the category

Value                Count    Frequency (%)
c20                  11657    9.4%
c9                   7716     6.2%
b32                  5628     4.5%
c5                   5491     4.4%
c23                  4958     4.0%
c2                   4862     3.9%
b43                  4845     3.9%
b40                  4623     3.7%
c8                   4358     3.5%
c6                   4174     3.4%
Other values (72)    66208    53.2%

Most occurring characters

ValueCountFrequency (%)
C69689
21.4%
248932
15.0%
141080
12.6%
B37526
11.5%
326971
 
8.3%
021142
 
6.5%
920083
 
6.2%
A17305
 
5.3%
417182
 
5.3%
67563
 
2.3%
Other values (3)18733
 
5.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number201686
61.8%
Uppercase Letter124520
38.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
248932
24.3%
141080
20.4%
326971
13.4%
021142
10.5%
920083
10.0%
417182
 
8.5%
67563
 
3.7%
57366
 
3.7%
75906
 
2.9%
85461
 
2.7%
Uppercase Letter
ValueCountFrequency (%)
C69689
56.0%
B37526
30.1%
A17305
 
13.9%

Most occurring scripts

ValueCountFrequency (%)
Common201686
61.8%
Latin124520
38.2%

Most frequent character per script

Common
ValueCountFrequency (%)
248932
24.3%
141080
20.4%
326971
13.4%
021142
10.5%
920083
10.0%
417182
 
8.5%
67563
 
3.7%
57366
 
3.7%
75906
 
2.9%
85461
 
2.7%
Latin
ValueCountFrequency (%)
C69689
56.0%
B37526
30.1%
A17305
 
13.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII326206
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
C69689
21.4%
248932
15.0%
141080
12.6%
B37526
11.5%
326971
 
8.3%
021142
 
6.5%
920083
 
6.2%
A17305
 
5.3%
417182
 
5.3%
67563
 
2.3%
Other values (3)18733
 
5.7%

NBSIN_TYPE1_AN1
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 MiB

0    116347
1    7508
2    604
3    61

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 124520
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value    Count     Frequency (%)
0        116347    93.4%
1        7508      6.0%
2        604       0.5%
3        61        < 0.1%

Length

Histogram of lengths of the category

Pie chart

Value    Count     Frequency (%)
0        116347    93.4%
1        7508      6.0%
2        604       0.5%
3        61        < 0.1%

Most occurring characters

ValueCountFrequency (%)
0116347
93.4%
17508
 
6.0%
2604
 
0.5%
361
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number124520
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0116347
93.4%
17508
 
6.0%
2604
 
0.5%
361
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common124520
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0116347
93.4%
17508
 
6.0%
2604
 
0.5%
361
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII124520
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0116347
93.4%
17508
 
6.0%
2604
 
0.5%
361
 
< 0.1%

NBSIN_TYPE1_AN3
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 MiB

0    118006
1    5961
2    506
3    47

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 124520
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value    Count     Frequency (%)
0        118006    94.8%
1        5961      4.8%
2        506       0.4%
3        47        < 0.1%

Length

Histogram of lengths of the category

Pie chart

Value    Count     Frequency (%)
0        118006    94.8%
1        5961      4.8%
2        506       0.4%
3        47        < 0.1%

Most occurring characters

ValueCountFrequency (%)
0118006
94.8%
15961
 
4.8%
2506
 
0.4%
347
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number124520
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0118006
94.8%
15961
 
4.8%
2506
 
0.4%
347
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common124520
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0118006
94.8%
15961
 
4.8%
2506
 
0.4%
347
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII124520
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0118006
94.8%
15961
 
4.8%
2506
 
0.4%
347
 
< 0.1%

NBSIN_TYPE2_AN1
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 MiB

0    122024
1    2338
2    146
3    12

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 124520
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value    Count     Frequency (%)
0        122024    98.0%
1        2338      1.9%
2        146       0.1%
3        12        < 0.1%

Length

Histogram of lengths of the category

Pie chart

Value    Count     Frequency (%)
0        122024    98.0%
1        2338      1.9%
2        146       0.1%
3        12        < 0.1%

Most occurring characters

ValueCountFrequency (%)
0122024
98.0%
12338
 
1.9%
2146
 
0.1%
312
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number124520
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0122024
98.0%
12338
 
1.9%
2146
 
0.1%
312
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common124520
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0122024
98.0%
12338
 
1.9%
2146
 
0.1%
312
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII124520
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0122024
98.0%
12338
 
1.9%
2146
 
0.1%
312
 
< 0.1%

NBSIN_TYPE2_AN2
Categorical

MISSING

Distinct: 4
Distinct (%): < 0.1%
Missing: 13558
Missing (%): 10.9%
Memory size: 1.9 MiB

0.0    109205
1.0    1657
2.0    93
3.0    7

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 332886
Distinct characters: 5
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

Value        Count     Frequency (%)
0.0          109205    87.7%
1.0          1657      1.3%
2.0          93        0.1%
3.0          7         < 0.1%
(Missing)    13558     10.9%

Length

Histogram of lengths of the category

Pie chart

Value    Count     Frequency (%)
0.0      109205    98.4%
1.0      1657      1.5%
2.0      93        0.1%
3.0      7         < 0.1%

Most occurring characters

ValueCountFrequency (%)
0220167
66.1%
.110962
33.3%
11657
 
0.5%
293
 
< 0.1%
37
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number221924
66.7%
Other Punctuation110962
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0220167
99.2%
11657
 
0.7%
293
 
< 0.1%
37
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
.110962
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common332886
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0220167
66.1%
.110962
33.3%
11657
 
0.5%
293
 
< 0.1%
37
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII332886
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0220167
66.1%
.110962
33.3%
11657
 
0.5%
293
 
< 0.1%
37
 
< 0.1%

NBSIN_TYPE2_AN3
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 MiB

0    122727
1    1687
2    101
3    5

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 124520
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value    Count     Frequency (%)
0        122727    98.6%
1        1687      1.4%
2        101       0.1%
3        5         < 0.1%

Length

Histogram of lengths of the category

Pie chart

Value    Count     Frequency (%)
0        122727    98.6%
1        1687      1.4%
2        101       0.1%
3        5         < 0.1%

Most occurring characters

ValueCountFrequency (%)
0122727
98.6%
11687
 
1.4%
2101
 
0.1%
35
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number124520
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0122727
98.6%
11687
 
1.4%
2101
 
0.1%
35
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common124520
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0122727
98.6%
11687
 
1.4%
2101
 
0.1%
35
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII124520
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0122727
98.6%
11687
 
1.4%
2101
 
0.1%
35
 
< 0.1%

id
Real number (ℝ≥0)

UNIQUE

Distinct: 124520
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 200071.9023
Minimum: 5
Maximum: 400357
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.9 MiB
2021-06-07T12:29:33.835772image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/

Quantile statistics

Minimum: 5
5-th percentile: 19745.55
Q1: 99973.5
Median: 200222
Q3: 300233.5
95-th percentile: 380580.1
Maximum: 400357
Range: 400352
Interquartile range (IQR): 200260

Descriptive statistics

Standard deviation: 115728.4772
Coefficient of variation (CV): 0.5784344319
Kurtosis: -1.200387168
Mean: 200071.9023
Median Absolute Deviation (MAD): 100113
Skewness: 0.0001860442491
Sum: 2.491295328 × 10^10
Variance: 1.339308043 × 10^10
Monotonicity: Strictly increasing
[Figure: Histogram with fixed size bins (bins=50)]
Common values
Value | Count | Frequency (%)
2049 | 1 | < 0.1%
146066 | 1 | < 0.1%
29403 | 1 | < 0.1%
371084 | 1 | < 0.1%
287449 | 1 | < 0.1%
27352 | 1 | < 0.1%
329718 | 1 | < 0.1%
2772 | 1 | < 0.1%
13011 | 1 | < 0.1%
277202 | 1 | < 0.1%
Other values (124510) | 124510 | > 99.9%

Minimum 10 values
Value | Count | Frequency (%)
5 | 1 | < 0.1%
9 | 1 | < 0.1%
11 | 1 | < 0.1%
13 | 1 | < 0.1%
14 | 1 | < 0.1%
15 | 1 | < 0.1%
20 | 1 | < 0.1%
21 | 1 | < 0.1%
28 | 1 | < 0.1%
30 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
400357 | 1 | < 0.1%
400356 | 1 | < 0.1%
400350 | 1 | < 0.1%
400348 | 1 | < 0.1%
400345 | 1 | < 0.1%
400343 | 1 | < 0.1%
400342 | 1 | < 0.1%
400336 | 1 | < 0.1%
400335 | 1 | < 0.1%
400332 | 1 | < 0.1%
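
Several of the statistics reported for id above are tied together by simple identities (range = maximum - minimum, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared). The following is a minimal sketch that re-derives them from the figures printed in the report; the variable names are illustrative only and are not part of the report.

```python
# Figures copied from the report's statistics for `id`.
minimum, maximum = 5, 400357
q1, q3 = 99973.5, 300233.5
mean = 200071.9023
std = 115728.4772

value_range = maximum - minimum   # reported as Range
iqr = q3 - q1                     # reported as Interquartile range (IQR)
cv = std / mean                   # reported as Coefficient of variation (CV)
variance = std ** 2               # reported as Variance

print(value_range, iqr, round(cv, 6), f"{variance:.6e}")
```

Because the inputs are themselves rounded to about ten significant digits, the recomputed CV and variance agree with the reported values only up to rounding.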

ANNEE
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 MiB

Value | Count
2016 | 43485
2017 | 41624
2018 | 39411

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 498080
Distinct characters: 6
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2017
2nd row: 2018
3rd row: 2017
4th row: 2018
5th row: 2017

Common Values

Value | Count | Frequency (%)
2016 | 43485 | 34.9%
2017 | 41624 | 33.4%
2018 | 39411 | 31.7%

Length

[Figure: Histogram of lengths of the category]

Pie chart

[Figure: pie chart]
Value | Count | Frequency (%)
2016 | 43485 | 34.9%
2017 | 41624 | 33.4%
2018 | 39411 | 31.7%

Most occurring characters

Value | Count | Frequency (%)
2 | 124520 | 25.0%
0 | 124520 | 25.0%
1 | 124520 | 25.0%
6 | 43485 | 8.7%
7 | 41624 | 8.4%
8 | 39411 | 7.9%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 498080 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
2 | 124520 | 25.0%
0 | 124520 | 25.0%
1 | 124520 | 25.0%
6 | 43485 | 8.7%
7 | 41624 | 8.4%
8 | 39411 | 7.9%

Most occurring scripts

Value | Count | Frequency (%)
Common | 498080 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
2 | 124520 | 25.0%
0 | 124520 | 25.0%
1 | 124520 | 25.0%
6 | 43485 | 8.7%
7 | 41624 | 8.4%
8 | 39411 | 7.9%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 498080 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
2 | 124520 | 25.0%
0 | 124520 | 25.0%
1 | 124520 | 25.0%
6 | 43485 | 8.7%
7 | 41624 | 8.4%
8 | 39411 | 7.9%

NB
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 MiB

Value | Count
0.0 | 122356
1.0 | 2098
2.0 | 65
3.0 | 1

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 373560
Distinct characters: 5
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
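
As a small illustration of how such per-character properties can be tallied, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The helper name `category_counts` and the two sample strings are illustrative assumptions; the report's own implementation may differ, and the standard library exposes categories only (not scripts or blocks).

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories over all characters of all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            # 'Nd' is Decimal Number, 'Po' is Other Punctuation.
            counts[unicodedata.category(ch)] += 1
    return counts

# Values shaped like those of NB ("0.0", "1.0", ...).
print(category_counts(["0.0", "1.0"]))
```

For values of this shape, two thirds of the characters are digits and one third is the decimal point, which matches the 66.7% / 33.3% split reported for this variable.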

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

Value | Count | Frequency (%)
0.0 | 122356 | 98.3%
1.0 | 2098 | 1.7%
2.0 | 65 | 0.1%
3.0 | 1 | < 0.1%

Length

[Figure: Histogram of lengths of the category]

Pie chart

[Figure: pie chart]
Value | Count | Frequency (%)
0.0 | 122356 | 98.3%
1.0 | 2098 | 1.7%
2.0 | 65 | 0.1%
3.0 | 1 | < 0.1%

Most occurring characters

Value | Count | Frequency (%)
0 | 246876 | 66.1%
. | 124520 | 33.3%
1 | 2098 | 0.6%
2 | 65 | < 0.1%
3 | 1 | < 0.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 249040 | 66.7%
Other Punctuation | 124520 | 33.3%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 246876 | 99.1%
1 | 2098 | 0.8%
2 | 65 | < 0.1%
3 | 1 | < 0.1%
Other Punctuation
Value | Count | Frequency (%)
. | 124520 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 373560 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 246876 | 66.1%
. | 124520 | 33.3%
1 | 2098 | 0.6%
2 | 65 | < 0.1%
3 | 1 | < 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 373560 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 246876 | 66.1%
. | 124520 | 33.3%
1 | 2098 | 0.6%
2 | 65 | < 0.1%
3 | 1 | < 0.1%

COUT
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED
ZEROS

Distinct: 1897
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 24.98666351
Minimum: 0
Maximum: 33089.12
Zeros: 122356
Zeros (%): 98.3%
Negative: 0
Negative (%): 0.0%
Memory size: 1.9 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 0
Maximum: 33089.12
Range: 33089.12
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 376.7359879
Coefficient of variation (CV): 15.07748275
Kurtosis: 1608.681443
Mean: 24.98666351
Median Absolute Deviation (MAD): 0
Skewness: 32.70976088
Sum: 3111339.34
Variance: 141930.0046
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]
Common values
Value | Count | Frequency (%)
0 | 122356 | 98.3%
24.52 | 49 | < 0.1%
24.21 | 29 | < 0.1%
24.35 | 16 | < 0.1%
286.44 | 6 | < 0.1%
235.35 | 6 | < 0.1%
52.42 | 5 | < 0.1%
91.96 | 5 | < 0.1%
76.08 | 4 | < 0.1%
692.59 | 4 | < 0.1%
Other values (1887) | 2040 | 1.6%

Minimum 10 values
Value | Count | Frequency (%)
0 | 122356 | 98.3%
5.07 | 1 | < 0.1%
5.11 | 1 | < 0.1%
14.2 | 1 | < 0.1%
14.35 | 1 | < 0.1%
15.13 | 1 | < 0.1%
15.22 | 2 | < 0.1%
16.71 | 1 | < 0.1%
20.05 | 1 | < 0.1%
20.18 | 3 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
33089.12 | 1 | < 0.1%
28743.2 | 1 | < 0.1%
23472.51 | 1 | < 0.1%
21466.47 | 1 | < 0.1%
19999.82 | 1 | < 0.1%
19868.26 | 1 | < 0.1%
18795.56 | 1 | < 0.1%
18358.72 | 1 | < 0.1%
16703.33 | 1 | < 0.1%
15116.45 | 1 | < 0.1%

PurePremium
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED
ZEROS

Distinct: 1968
Distinct (%): 1.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 32.12322995
Minimum: 0
Maximum: 185244.9171
Zeros: 122356
Zeros (%): 98.3%
Negative: 0
Negative (%): 0.0%
Memory size: 1.9 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 0
Maximum: 185244.9171
Range: 185244.9171
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 749.479255
Coefficient of variation (CV): 23.33137907
Kurtosis: 30850.66021
Mean: 32.12322995
Median Absolute Deviation (MAD): 0
Skewness: 138.335193
Sum: 3999984.594
Variance: 561719.1537
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]
Common values
Value | Count | Frequency (%)
0 | 122356 | 98.3%
24.52 | 37 | < 0.1%
24.21 | 22 | < 0.1%
24.35 | 14 | < 0.1%
286.44 | 5 | < 0.1%
52.42 | 4 | < 0.1%
105.92 | 4 | < 0.1%
684.98 | 4 | < 0.1%
695.16 | 4 | < 0.1%
86.23 | 4 | < 0.1%
Other values (1958) | 2066 | 1.7%

Minimum 10 values
Value | Count | Frequency (%)
0 | 122356 | 98.3%
5.07 | 1 | < 0.1%
5.11 | 1 | < 0.1%
14.2 | 1 | < 0.1%
14.35 | 1 | < 0.1%
15.13 | 1 | < 0.1%
15.22 | 1 | < 0.1%
16.71 | 1 | < 0.1%
20.18 | 3 | < 0.1%
20.20109126 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
185244.9171 | 1 | < 0.1%
61212.98964 | 1 | < 0.1%
53526.92013 | 1 | < 0.1%
44106.91654 | 1 | < 0.1%
33089.12 | 1 | < 0.1%
31979.62462 | 1 | < 0.1%
30427.13023 | 1 | < 0.1%
29755.83356 | 1 | < 0.1%
28027.42415 | 1 | < 0.1%
27225.20619 | 1 | < 0.1%

Frequency
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED
ZEROS

Distinct: 323
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.02258020873
Minimum: 0
Maximum: 52.28574944
Zeros: 122356
Zeros (%): 98.3%
Negative: 0
Negative (%): 0.0%
Memory size: 1.9 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 0
Maximum: 52.28574944
Range: 52.28574944
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.2819514474
Coefficient of variation (CV): 12.48666258
Kurtosis: 12466.67838
Mean: 0.02258020873
Median Absolute Deviation (MAD): 0
Skewness: 83.65521992
Sum: 2811.68759
Variance: 0.07949661872
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]
Common values
Value | Count | Frequency (%)
0 | 122356 | 98.3%
1 | 1697 | 1.4%
2 | 54 | < 0.1%
3.379630539 | 4 | < 0.1%
1.144200872 | 4 | < 0.1%
1.031073527 | 4 | < 0.1%
1.057803751 | 4 | < 0.1%
2.316456752 | 3 | < 0.1%
1.185065304 | 3 | < 0.1%
1.71028125 | 3 | < 0.1%
Other values (313) | 388 | 0.3%

Minimum 10 values
Value | Count | Frequency (%)
0 | 122356 | 98.3%
1 | 1697 | 1.4%
1.000000015 | 1 | < 0.1%
1.000000089 | 1 | < 0.1%
1.000000181 | 3 | < 0.1%
1.000000238 | 2 | < 0.1%
1.002747434 | 2 | < 0.1%
1.005494579 | 2 | < 0.1%
1.005510006 | 2 | < 0.1%
1.008264574 | 2 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
52.28574944 | 1 | < 0.1%
33.27272839 | 1 | < 0.1%
30.41667789 | 1 | < 0.1%
17.38096107 | 1 | < 0.1%
14.60000488 | 1 | < 0.1%
14.03846576 | 1 | < 0.1%
12.16666896 | 1 | < 0.1%
11.7741955 | 1 | < 0.1%
11.09091499 | 1 | < 0.1%
10.76471152 | 1 | < 0.1%

AvgClaimAmount
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED
ZEROS

Distinct: 1894
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 24.35936686
Minimum: 0
Maximum: 33089.12
Zeros: 122356
Zeros (%): 98.3%
Negative: 0
Negative (%): 0.0%
Memory size: 1.9 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 0
Maximum: 33089.12
Range: 33089.12
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 369.0591316
Coefficient of variation (CV): 15.15060443
Kurtosis: 1705.598269
Mean: 24.35936686
Median Absolute Deviation (MAD): 0
Skewness: 33.53925473
Sum: 3033228.362
Variance: 136204.6426
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]
Common values
Value | Count | Frequency (%)
0 | 122356 | 98.3%
24.52 | 50 | < 0.1%
24.21 | 29 | < 0.1%
24.35 | 17 | < 0.1%
235.35 | 6 | < 0.1%
286.44 | 6 | < 0.1%
52.42 | 5 | < 0.1%
91.96 | 5 | < 0.1%
54.48 | 4 | < 0.1%
692.59 | 4 | < 0.1%
Other values (1884) | 2038 | 1.6%

Minimum 10 values
Value | Count | Frequency (%)
0 | 122356 | 98.3%
5.07 | 1 | < 0.1%
5.11 | 1 | < 0.1%
14.2 | 1 | < 0.1%
14.35 | 1 | < 0.1%
15.13 | 1 | < 0.1%
15.22 | 2 | < 0.1%
16.71 | 1 | < 0.1%
20.05 | 1 | < 0.1%
20.18 | 3 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
33089.12 | 1 | < 0.1%
28743.2 | 1 | < 0.1%
23472.51 | 1 | < 0.1%
21466.47 | 1 | < 0.1%
19999.82 | 1 | < 0.1%
19868.26 | 1 | < 0.1%
18795.56 | 1 | < 0.1%
18358.72 | 1 | < 0.1%
16703.33 | 1 | < 0.1%
15116.45 | 1 | < 0.1%

Interactions

[Figures: 49 interaction plots, one per ordered pair of the 7 numeric variables]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that, for a linear relationship, the steepness of the line (its angle to the x-axis) does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
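
The definition above can be sketched in a few lines of plain Python. The function name `pearson_r` and the sample data are illustrative assumptions; this is not the code used to build the report.

```python
import math

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n factors of the covariance and of the two standard deviations
    # cancel, so plain sums of deviations are enough.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (dev_x * dev_y)

xs = [1.0, 2.0, 3.0, 4.0]
print(pearson_r(xs, [2.0, 4.0, 6.0, 8.0]))   # perfectly increasing line
print(pearson_r(xs, [8.0, 6.0, 4.0, 2.0]))   # perfectly decreasing line
```

Shifting or positively rescaling either input (for example, replacing each x by 10x + 3) leaves the result unchanged, which is the invariance mentioned above.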

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
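
A minimal sketch of this rank-based calculation follows, assuming ties receive the average of the ranks they span (a common convention, not stated in the report). The names `average_ranks` and `spearman_rho` are illustrative.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the average of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Pearson correlation applied to the rank variables of xs and ys."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    dx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    dy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (dx * dy)

# A cubic relationship is nonlinear but monotonic, so rho is 1 (up to rounding).
print(spearman_rho([1, 2, 3, 4, 5], [1, 8, 27, 64, 125]))
```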

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
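
The pair-counting description above corresponds to the variant usually called tau-a. The sketch below implements exactly that description; the name `kendall_tau_a` is illustrative, and statistical libraries often report tau-b instead, which adjusts for ties, so results can differ when the data contain tied values.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(xs)), 2))
    for i, j in pairs:
        # Positive product: both variables move in the same direction.
        direction = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # Tied pairs count toward the total but toward neither group.
    return (concordant - discordant) / len(pairs)

# Three pairs in total: two concordant, one discordant.
print(kendall_tau_a([1, 2, 3], [1, 3, 2]))
```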

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
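
As a hedged illustration, the sketch below computes a bias-corrected Cramér's V following the correction commonly attributed to Bergsma (2013): the chi-squared based φ² and the table dimensions are each shrunk by a term in 1/(n - 1). The function name `cramers_v_corrected` is an assumption, and the report's actual implementation may differ in detail.

```python
import math
from collections import Counter

def cramers_v_corrected(xs, ys):
    """Bias-corrected Cramér's V for two equally long sequences of labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)

    # Pearson chi-squared statistic of the contingency table.
    chi2 = 0.0
    for a, row_n in row_totals.items():
        for b, col_n in col_totals.items():
            expected = row_n * col_n / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    r, k = len(row_totals), len(col_totals)
    # Bias correction of phi^2 and of the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    if denom <= 0:
        return 0.0
    return math.sqrt(phi2_corr / denom)

# Perfect association between two binary labels gives a value of 1 (up to rounding).
print(cramers_v_corrected(["a"] * 50 + ["b"] * 50, ["x"] * 50 + ["y"] * 50))
```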

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
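
One simple way to picture nullity correlation is to encode each cell as present (1) or missing (0) and correlate the resulting indicator columns. The sketch below does this for a pair of columns; the name `nullity_correlation`, the use of `None` for missing values, and the toy data are assumptions, and the heatmap in the report may be computed differently.

```python
import math

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the presence indicators of two columns.

    Returns None when either column is fully present or fully missing,
    because its indicator then has no variance.
    """
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    dev_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    dev_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    if dev_a == 0 or dev_b == 0:
        return None
    return cov / (dev_a * dev_b)

# One column is missing exactly where the other is present.
print(nullity_correlation([1, None, 3, None], [None, 2, None, 4]))
```

A value near +1 means the two columns tend to be missing together, and a value near -1 means one tends to be present exactly when the other is missing.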
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.

Sample

First rows

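index | EXPO | FORMULE | TYPE_RESIDENCE | TYPE_HABITATION | NB_PIECES | SITUATION_JURIDIQUE | NIVEAU_JURIDIQUE | VALEUR_DES_BIENS | OBJETS_DE_VALEUR | ZONIER | NBSIN_TYPE1_AN1 | NBSIN_TYPE1_AN3 | NBSIN_TYPE2_AN1 | NBSIN_TYPE2_AN2 | NBSIN_TYPE2_AN3 | id | ANNEE | NB | COUT | PurePremium | Frequency | AvgClaimAmount
0 | 1.000000 | MEDIUM | PRINCIPALE | APPARTEMENT | 1.0 | PROPRIO | JUR1 | 3500.0 | NIVEAU_1 | B40 | 0 | 0 | 0 | 0.0 | 0 | 5 | 2017 | 0.0 | 0.00 | 0.00 | 0.0 | 0.00
1 | 0.824657 | CONFORT | PRINCIPALE | MAISON | NaN | PROPRIO | JUR1 | 0.0 | NIVEAU_1 | A11 | 0 | 0 | 0 | 0.0 | 0 | 9 | 2018 | 0.0 | 0.00 | 0.00 | 0.0 | 0.00
2 | 1.000000 | ESSENTIEL | PRINCIPALE | APPARTEMENT | 3.0 | LOCATAIRE | JUR1 | 35000.0 | NIVEAU_1 | B32 | 0 | 0 | 0 | 0.0 | 0 | 11 | 2017 | 0.0 | 0.00 | 0.00 | 0.0 | 0.00
3 | 1.000000 | ESSENTIEL | SECONDAIRE | MAISON | 2.0 | LOCATAIRE | JUR1 | 9000.0 | NIVEAU_1 | C24 | 0 | 0 | 0 | 0.0 | 0 | 13 | 2018 | 0.0 | 0.00 | 0.00 | 0.0 | 0.00
4 | 1.000000 | ESSENTIEL | PRINCIPALE | MAISON | 1.0 | LOCATAIRE | JUR1 | 20000.0 | NIVEAU_1 | C9 | 0 | 0 | 0 | 0.0 | 0 | 14 | 2017 | 0.0 | 0.00 | 0.00 | 0.0 | 0.00
5 | 1.000000 | CONFORT | PRINCIPALE | MAISON | 2.0 | LOCATAIRE | JUR1 | 9000.0 | NIVEAU_1 | C20 | 0 | 0 | 0 | 0.0 | 0 | 15 | 2017 | 1.0 | 521.43 | 521.43 | 1.0 | 521.43
6 | 1.000000 | ESSENTIEL | PRINCIPALE | MAISON | 2.0 | LOCATAIRE | JUR1 | 20000.0 | NIVEAU_1 | C2 | 1 | 0 | 0 | 0.0 | 0 | 20 | 2018 | 0.0 | 0.00 | 0.00 | 0.0 | 0.00
7 | 0.986301 | CONFORT | PRINCIPALE | MAISON | 2.0 | LOCATAIRE | JUR1 | 9000.0 | NIVEAU_1 | C8 | 0 | 0 | 0 | 0.0 | 0 | 21 | 2016 | 0.0 | 0.00 | 0.00 | 0.0 | 0.00
8 | 1.000000 | MEDIUM | PRINCIPALE | MAISON | 1.0 | PROPRIO | JUR1 | 3500.0 | NIVEAU_1 | A9 | 0 | 0 | 0 | 0.0 | 0 | 28 | 2018 | 0.0 | 0.00 | 0.00 | 0.0 | 0.00
9 | 1.000000 | CONFORT | SECONDAIRE | APPARTEMENT | 3.0 | LOCATAIRE | JUR1 | 0.0 | NIVEAU_1 | A7 | 0 | 0 | 0 | 0.0 | 0 | 30 | 2016 | 0.0 | 0.00 | 0.00 | 0.0 | 0.00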
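
The row with index 5 is the only sampled row with a claim, and its figures are consistent with the last three columns being simple ratios: Frequency = NB / EXPO, PurePremium = COUT / EXPO, and AvgClaimAmount = COUT / NB. The report does not state these definitions, so the sketch below is an assumption inferred from the sample, and the function name `derived_targets` is hypothetical.

```python
def derived_targets(nb, cout, expo):
    """Assumed definitions, inferred from the sample rows (not stated in the report)."""
    frequency = nb / expo          # claims per unit of exposure
    pure_premium = cout / expo     # claim cost per unit of exposure
    # Policies without claims are given an average claim amount of 0.
    avg_claim_amount = cout / nb if nb > 0 else 0.0
    return frequency, pure_premium, avg_claim_amount

# Row with index 5: NB=1.0, COUT=521.43, EXPO=1.0
print(derived_targets(1.0, 521.43, 1.0))
```

Under this reading, a policy with exposure below 1 would show a PurePremium larger than its COUT, which would be consistent with the maximum of PurePremium exceeding the maximum of COUT in the statistics above.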

Last rows

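index | EXPO | FORMULE | TYPE_RESIDENCE | TYPE_HABITATION | NB_PIECES | SITUATION_JURIDIQUE | NIVEAU_JURIDIQUE | VALEUR_DES_BIENS | OBJETS_DE_VALEUR | ZONIER | NBSIN_TYPE1_AN1 | NBSIN_TYPE1_AN3 | NBSIN_TYPE2_AN1 | NBSIN_TYPE2_AN2 | NBSIN_TYPE2_AN3 | id | ANNEE | NB | COUT | PurePremium | Frequency | AvgClaimAmount
124510 | 0.504109 | MEDIUM | PRINCIPALE | MAISON | 1.0 | PROPRIO | JUR2 | 3500.0 | NIVEAU_1 | B26 | 0 | 0 | 0 | 0.0 | 0 | 400332 | 2018 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
124511 | 0.082192 | CONFORT | PRINCIPALE | MAISON | 1.0 | PROPRIO | JUR1 | 3500.0 | NIVEAU_1 | C20 | 0 | 0 | 0 | 0.0 | 0 | 400335 | 2018 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
124512 | 0.093151 | CONFORT | PRINCIPALE | MAISON | 1.0 | PROPRIO | JUR1 | 9000.0 | NIVEAU_1 | B26 | 0 | 0 | 0 | NaN | 0 | 400336 | 2018 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
124513 | 1.000000 | ESSENTIEL | PRINCIPALE | APPARTEMENT | 3.0 | LOCATAIRE | JUR1 | 50000.0 | NIVEAU_1 | B17 | 0 | 0 | 0 | 0.0 | 0 | 400342 | 2017 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
124514 | 0.671233 | ESSENTIEL | PRINCIPALE | MAISON | 1.0 | PROPRIO | JUR1 | 3500.0 | NIVEAU_1 | B32 | 0 | 0 | 0 | 0.0 | 0 | 400343 | 2016 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
124515 | 1.000000 | CONFORT | PRINCIPALE | MAISON | 1.0 | PROPRIO | JUR1 | 3500.0 | NIVEAU_1 | C20 | 1 | 0 | 0 | 0.0 | 0 | 400345 | 2018 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
124516 | 1.000000 | MEDIUM | PRINCIPALE | MAISON | 2.0 | PROPRIO | JUR1 | 3500.0 | NIVEAU_1 | A2 | 0 | 0 | 0 | 0.0 | 0 | 400348 | 2017 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
124517 | 0.273973 | CONFORT | SECONDAIRE | MAISON | 1.0 | PROPRIO | JUR1 | 3500.0 | NIVEAU_1 | C16 | 0 | 0 | 0 | 0.0 | 0 | 400350 | 2018 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
124518 | 1.000000 | MEDIUM | PRINCIPALE | MAISON | 2.0 | LOCATAIRE | JUR1 | 3500.0 | NIVEAU_1 | C9 | 0 | 0 | 0 | 0.0 | 0 | 400356 | 2016 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
124519 | 0.939726 | CONFORT | PRINCIPALE | MAISON | 2.0 | PROPRIO | JUR1 | 9000.0 | NIVEAU_1 | C5 | 0 | 0 | 0 | 0.0 | 0 | 400357 | 2018 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0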